Audio-based feedback for head-mountable device

ABSTRACT

A head-mountable device can include multiple microphones for directional audio detection. The head-mountable device can also include a speaker for audio output and/or a display for visual output. The head-mountable device can be configured to provide visual outputs based on audio inputs by displaying an indicator on a display based on a location of a source of a sound. The head-mountable device can be configured to provide audio outputs based on audio inputs by modifying an audio output of the speaker based on a detected sound and a target characteristic. Such characteristics can be based on a direction of a gaze of the user, as detected by an eye sensor.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation of U.S. application Ser. No. 16/935,052, entitled “AUDIO-BASED FEEDBACK FOR HEAD-MOUNTABLE DEVICE,” filed Jul. 21, 2020, which claims the benefit of U.S. Provisional Application No. 62/889,473, entitled “AUDIO-BASED FEEDBACK FOR HEAD-MOUNTABLE DEVICE,” filed Aug. 20, 2019, the entireties of which are incorporated herein by reference.

TECHNICAL FIELD

The present description relates generally to head-mountable devices, and, more particularly, to audio-based feedback for head-mountable devices.

BACKGROUND

A head-mountable device can be worn by a user to display visual information within the field-of-view of the user. The head-mountable device can be used as a virtual reality (VR) system, an augmented reality (AR) system, and/or a mixed reality (MR) system. A user may observe outputs provided by the head-mountable device, such as visual information provided on a display. The display can optionally allow a user to observe an environment outside of the head-mountable device. Other outputs provided by the head-mountable device can include speaker output and/or haptic feedback. A user may further interact with the head-mountable device by providing inputs for processing by one or more components of the head-mountable device. For example, the user can provide tactile inputs, voice commands, and other inputs while the device is mounted to the user's head.

BRIEF DESCRIPTION OF THE DRAWINGS

Certain features of the subject technology are set forth in the appended claims. However, for purposes of explanation, several embodiments of the subject technology are set forth in the following figures.

FIG. 1 illustrates a perspective view of a head-mountable device on a user, according to some embodiments of the present disclosure.

FIG. 2 illustrates a block diagram of a head-mountable device, in accordance with some embodiments of the present disclosure.

FIG. 3 illustrates a top view of a user wearing a head-mountable device and a source of a sound within a field-of-view of the user, according to some embodiments of the present disclosure.

FIG. 4 illustrates a view of the head-mountable device of FIG. 3 providing a visual output, according to some embodiments of the present disclosure.

FIG. 5 illustrates a top view of a user wearing a head-mountable device and a source of a sound that is outside a field-of-view of the user, according to some embodiments of the present disclosure.

FIG. 6 illustrates a view of the head-mountable device of FIG. 5 providing a visual output, according to some embodiments of the present disclosure.

FIG. 7 illustrates a method of operating a head-mountable device to provide audio-based feedback with a display of the head-mountable device, according to some embodiments of the present disclosure.

FIG. 8 illustrates a method of operating a head-mountable device to provide audio-based feedback with a speaker of the head-mountable device, according to some embodiments of the present disclosure.

FIG. 9 illustrates a method of operating a head-mountable device to provide audio-based feedback with a speaker of the head-mountable device, according to some embodiments of the present disclosure.

DETAILED DESCRIPTION

The detailed description set forth below is intended as a description of various configurations of the subject technology and is not intended to represent the only configurations in which the subject technology may be practiced. The appended drawings are incorporated herein and constitute a part of the detailed description. The detailed description includes specific details for the purpose of providing a thorough understanding of the subject technology. However, it will be clear and apparent to those skilled in the art that the subject technology is not limited to the specific details set forth herein and may be practiced without these specific details. In some instances, well-known structures and components are shown in block diagram form in order to avoid obscuring the concepts of the subject technology.

Head-mountable devices, such as head-mountable displays, headsets, visors, smartglasses, heads-up displays, etc., can perform a range of functions that are managed by the components (e.g., sensors, circuitry, and other hardware) included with the wearable device. A head-mountable device can capture various types of inputs (e.g., visual, audio, tactile, etc.) from an environment and/or the user. A head-mountable device can also provide various types of outputs (e.g., visual, audio, tactile, etc.) to a user and/or the environment.

In particular, a head-mountable device can be provided with multiple microphones for capturing audio information (e.g., sounds) from multiple sources that are located in different directions with respect to the head-mountable device. Multiple microphones distributed across the head-mountable device can provide directional audio detection. The head-mountable device can use the data collected by the microphones to provide visual and/or audio outputs to the user. For example, the detected audio inputs can be rendered with visual outputs by providing indicators directing the user to the source of the sound. This can allow the user to correctly and readily identify the location of the source, even when the user is not readily able to hear the sound independently of the head-mountable device. By further example, the detected audio inputs can be rendered with audio outputs that emphasize (e.g., amplify) certain sounds over others to help the user distinguish between different sounds.

Systems of the present disclosure can include a head-mountable device with multiple microphones. The head-mountable device can also include a speaker for audio output and/or a display for visual output. The head-mountable device can be configured to provide visual outputs based on audio inputs by displaying an indicator on a display based on a location of a source of a sound. The head-mountable device can be configured to provide audio outputs based on audio inputs by modifying an audio output of the speaker based on a detected sound and a target characteristic. Such characteristics can be based on a direction of a gaze of the user, as detected by an eye sensor.

These and other embodiments are discussed below with reference to FIGS. 1-9. However, those skilled in the art will readily appreciate that the detailed description given herein with respect to these Figures is for explanatory purposes only and should not be construed as limiting.

According to some embodiments, for example as shown in FIG. 1, a head-mountable device 100 includes a frame 110 that is worn on a head of a user. The frame 110 can be positioned in front of the eyes of a user to provide information within a field-of-view of the user. The frame 110 can provide nose pads or another feature to rest on a user's nose. The frame 110 can be supported on a user's head with the securement element 120. The securement element 120 can wrap or extend along opposing sides of a user's head. The securement element 120 can include earpieces for wrapping around or otherwise engaging or resting on a user's ears. It will be appreciated that other configurations can be applied for securing the head-mountable device 100 to a user's head. For example, one or more bands, straps, belts, caps, hats, or other components can be used in addition to or in place of the illustrated components of the head-mountable device 100. By further example, the securement element 120 can include multiple components to engage a user's head.

The frame 110 can provide structure around a peripheral region thereof to support any internal components of the frame 110 in their assembled position. For example, the frame 110 can enclose and support various internal components (including for example integrated circuit chips, processors, memory devices and other circuitry) to provide computing and functional operations for the head-mountable device 100, as discussed further herein. Any number of components can be included within and/or on the frame 110 and/or the securement element 120.

The head-mountable device 100 can include multiple microphones 130 distributed on the frame 110 and/or the securement element 120. The microphones 130 can be spatially distributed evenly or unevenly. The microphones 130 can be positioned at various portions, such as on a front, rear, left, right, top, and/or bottom side of the head-mountable device 100 (e.g., including the frame 110 and/or the securement element 120). The microphones 130 can be omnidirectional or directional. Detection of sound source direction can be performed with one or more of a variety of microphone types, as discussed further herein.

One or more of the microphones 130 can be or include a directional microphone that is configured to be most sensitive to sound in a particular direction. Such directionality can be provided based on structural features of the microphone 130 and/or surrounding structures. For example, one or more of the microphones 130 can include or be adjacent to a parabolic reflector that collects and focuses sound waves from a particular direction onto a transducer. Based on the known directionality relative to other portions of the head-mountable device 100, sound received by such a microphone 130 can be attributed to a source in a particular direction with respect to the head-mountable device 100. Different microphones 130 can be oriented with different directionalities to provide an array of coverage that captures sounds from a variety of (e.g., all) directions.
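By way of illustration only, the following sketch (in Python, which the disclosure does not prescribe) shows one simple way such an attribution could be implemented: each directional microphone is associated with a known orientation, and the sound is attributed to the orientation of the microphone whose current frame of samples carries the most energy. The function names, the RMS criterion, and the fixed orientation table are illustrative assumptions rather than features of the disclosed device.

    import numpy as np

    # Illustrative orientation table: the azimuth (degrees) to which each
    # directional microphone is most sensitive, in the device frame.
    MIC_AZIMUTHS_DEG = {"front": 0.0, "right": 90.0, "rear": 180.0, "left": 270.0}

    def rms(frame: np.ndarray) -> float:
        """Root-mean-square energy of one frame of audio samples."""
        return float(np.sqrt(np.mean(np.square(frame))))

    def estimate_source_azimuth(frames: dict[str, np.ndarray]) -> float:
        """Attribute the sound to the direction of the loudest directional mic.

        frames maps a microphone name to its most recent block of samples.
        """
        loudest = max(frames, key=lambda name: rms(frames[name]))
        return MIC_AZIMUTHS_DEG[loudest]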

An array of multiple microphones can be operated to isolate a sound source and reject ambient noise and reverberation. For example, multiple microphones can be operated to perform beamforming by combining sounds from two or more microphones to allow preferential capture of sounds coming from certain directions. In a delay-and-sum beamformer, sounds from each microphone are delayed relative to sounds from the other microphones, and the delayed signals are added. The amount of delay determines the beam angle (e.g., the angle in which the array preferentially “listens”). When a sound arrives from this angle, the sound signals from the multiple microphones are added constructively. The resulting sum is stronger, and the sound is received relatively well. When a sound arrives from another angle, the delayed signals from the various microphones add destructively (e.g., with positive and negative parts of the sound waves canceling out to some degree) and the sum is not as loud as an equivalent sound arriving from the beam angle. For example, if a sound arrives at a microphone on the right before it reaches a microphone on the left, then it can be determined that the sound source is to the right of the microphone array. During sound capturing, a controller (e.g., processor) can “aim” a capturing beam in a direction of the sound source. Beamforming allows a microphone array to simulate a directional microphone pointing toward the sound source. The directivity of the microphone array reduces the amount of captured ambient noise and reverberated sound as compared to a single microphone. This may provide a clearer representation of a sound source. A beamforming microphone array may be made up of distributed omnidirectional microphones linked to a processor that combines the several inputs into an output with a coherent form. Arrays may be formed using a number of closely spaced microphones. Given a fixed physical relationship in space between the different individual microphone transducer array elements, simultaneous digital signal processor (DSP) processing of the signals from each of the individual microphones in the array can create one or more “virtual” microphones.
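The delay-and-sum operation described above can be illustrated with a minimal sketch for a uniform linear array. It assumes a far-field source, a known microphone spacing and sample rate, and rounds delays to whole samples; a practical implementation would use fractional-delay filters. All names are illustrative.

    import numpy as np

    SPEED_OF_SOUND = 343.0  # meters per second, approximate in air

    def delay_and_sum(signals: np.ndarray, mic_spacing: float,
                      sample_rate: float, beam_angle_deg: float) -> np.ndarray:
        """Steer a uniform linear array toward beam_angle_deg.

        signals has shape (num_mics, num_samples), one row per microphone.
        """
        num_mics, num_samples = signals.shape
        angle = np.deg2rad(beam_angle_deg)
        # Per-microphone arrival offset (in samples) for a far-field source
        # at the beam angle, measured from broadside of the array.
        arrival = np.round(np.arange(num_mics) * mic_spacing * np.sin(angle)
                           * sample_rate / SPEED_OF_SOUND).astype(int)
        comp = arrival.max() - arrival  # delay the earlier mics so all align
        out = np.zeros(num_samples)
        for sig, d in zip(signals, comp):
            out[d:] += sig[:num_samples - d]  # align, then sum
        return out / num_mics

Running this function several times with different beam angles on the same captured block, and comparing the output energies, is one way the “virtual” microphones mentioned above could be realized.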

The head-mountable device 100 can include one or more speakers 212. Where multiple speakers are provided, the speakers can be directed to each of a user's ears to provide stereo sound. Other speaker arrangements are contemplated, including surround sound. Additionally or alternatively, the head-mountable device 100 can be operably connected to speakers that are directed to, near, or in a user's ears.

The frame 110 can include and/or support a display 190 that provides visual output for viewing by a user wearing the head-mountable device 100. For example, one or more optical modules can each provide a display 190 that is positioned on an inner side of the frame 110. As used herein, an inner side of a portion of a head-mountable device is a side that faces toward the user and/or away from the external environment. For example, a pair of optical modules can be provided, where each optical module is movably positioned to be within the field-of-view of each of a user's two eyes. Each optical module can be adjusted to align with a corresponding eye of the user. For example, each optical module can be moved along one or more axes until a center of each optical module is aligned with a center of the corresponding eye. Accordingly, the distance between the optical modules can be set based on an interpupillary distance of the user.

The frame 110 can include and/or support one or more cameras 150. The cameras 150 can be positioned on or near an outer side of the frame 110 to capture images of views external to the head-mountable device 100. As used herein, an outer side of a portion of a head-mountable device is a side that faces away from the user and/or towards an external environment. The captured images can be visually output by the display 190 to the user and/or stored for any other purpose. Accordingly, the display 190 is able to accurately reproduce, simulate, or augment a view based on a view captured by the camera 150.

The display 190 and accompanying components can transmit light from a physical environment (e.g., as captured by the camera 150) for viewing by the user. Such a display 190 and/or accompanying components can include optical properties, such as lenses for vision correction based on incoming light from the physical environment. Additionally or alternatively, a display 190 can provide information within a field-of-view of the user. Such information can be provided to the exclusion of a view of a physical environment or in addition to (e.g., overlaid with) a physical environment.

A physical environment refers to a physical world that people can sense and/or interact with without aid of electronic systems. Physical environments, such as a physical park, include physical articles, such as physical trees, physical buildings, and physical people. People can directly sense and/or interact with the physical environment, such as through sight, touch, hearing, taste, and smell.

In contrast, a computer-generated reality (CGR) environment refers to a wholly or partially simulated environment that people sense and/or interact with via an electronic system. In CGR, a subset of a person's physical motions, or representations thereof, are tracked, and, in response, one or more characteristics of one or more virtual objects simulated in the CGR environment are adjusted in a manner that comports with at least one law of physics. For example, a CGR system may detect a person's head turning and, in response, adjust graphical content and an acoustic field presented to the person in a manner similar to how such views and sounds would change in a physical environment. In some situations (e.g., for accessibility reasons), adjustments to characteristic(s) of virtual object(s) in a CGR environment may be made in response to representations of physical motions (e.g., vocal commands).

A person may sense and/or interact with a CGR object using any one of their senses, including sight, sound, touch, taste, and smell. For example, a person may sense and/or interact with audio objects that create a 3D or spatial audio environment that provides the perception of point audio sources in 3D space. In another example, audio objects may enable audio transparency, which selectively incorporates ambient sounds from the physical environment with or without computer-generated audio. In some CGR environments, a person may sense and/or interact only with audio objects.

Examples of CGR include virtual reality and mixed reality.

A virtual reality (VR) environment refers to a simulated environment that is designed to be based entirely on computer-generated sensory inputs for one or more senses. A VR environment comprises a plurality of virtual objects with which a person may sense and/or interact. For example, computer-generated imagery of trees, buildings, and avatars representing people are examples of virtual objects. A person may sense and/or interact with virtual objects in the VR environment through a simulation of the person's presence within the computer-generated environment, and/or through a simulation of a subset of the person's physical movements within the computer-generated environment.

In contrast to a VR environment, which is designed to be based entirely on computer-generated sensory inputs, a mixed reality (MR) environment refers to a simulated environment that is designed to incorporate sensory inputs from the physical environment, or a representation thereof, in addition to including computer-generated sensory inputs (e.g., virtual objects). On a virtuality continuum, a mixed reality environment is anywhere between, but not including, a wholly physical environment at one end and a virtual reality environment at the other end.

In some MR environments, computer-generated sensory inputs may respond to changes in sensory inputs from the physical environment. Also, some electronic systems for presenting an MR environment may track location and/or orientation with respect to the physical environment to enable virtual objects to interact with real objects (that is, physical articles from the physical environment or representations thereof). For example, a system may account for movements so that a virtual tree appears stationary with respect to the physical ground.

Examples of mixed realities include augmented reality and augmentedvirtuality.

An augmented reality (AR) environment refers to a simulated environment in which one or more virtual objects are superimposed over a physical environment, or a representation thereof. For example, an electronic system for presenting an AR environment may have a transparent or translucent display through which a person may directly view the physical environment. The system may be configured to present virtual objects on the transparent or translucent display, so that a person, using the system, perceives the virtual objects superimposed over the physical environment. Alternatively, a system may have an opaque display and one or more imaging sensors that capture images or video of the physical environment, which are representations of the physical environment. The system composites the images or video with virtual objects, and presents the composition on the opaque display. A person, using the system, indirectly views the physical environment by way of the images or video of the physical environment, and perceives the virtual objects superimposed over the physical environment. As used herein, a video of the physical environment shown on an opaque display is called “pass-through video,” meaning a system uses one or more image sensor(s) to capture images of the physical environment, and uses those images in presenting the AR environment on the opaque display. Further alternatively, a system may have a projection system that projects virtual objects into the physical environment, for example, as a hologram or on a physical surface, so that a person, using the system, perceives the virtual objects superimposed over the physical environment.

An augmented reality environment also refers to a simulated environment in which a representation of a physical environment is transformed by computer-generated sensory information. For example, in providing pass-through video, a system may transform one or more sensor images to impose a select perspective (e.g., viewpoint) different from the perspective captured by the imaging sensors. As another example, a representation of a physical environment may be transformed by graphically modifying (e.g., enlarging) portions thereof, such that the modified portions may be representative but not photorealistic versions of the originally captured images. As a further example, a representation of a physical environment may be transformed by graphically eliminating or obfuscating portions thereof.

An augmented virtuality (AV) environment refers to a simulated environment in which a virtual or computer-generated environment incorporates one or more sensory inputs from the physical environment. The sensory inputs may be representations of one or more characteristics of the physical environment. For example, an AV park may have virtual trees and virtual buildings, but people with faces photorealistically reproduced from images taken of physical people. As another example, a virtual object may adopt a shape or color of a physical article imaged by one or more imaging sensors. As a further example, a virtual object may adopt shadows consistent with the position of the sun in the physical environment.

There are many different types of electronic systems that enable a person to sense and/or interact with various CGR environments. Examples include head-mountable systems, projection-based systems, heads-up displays (HUDs), vehicle windshields having integrated display capability, windows having integrated display capability, displays formed as lenses designed to be placed on a person's eyes (e.g., similar to contact lenses), headphones/earphones, speaker arrays, input systems (e.g., wearable or handheld controllers with or without haptic feedback), smartphones, tablets, and desktop/laptop computers. A head-mountable system may have one or more speaker(s) and an integrated opaque display. Alternatively, a head-mountable system may be configured to accept an external opaque display (e.g., a smartphone). The head-mountable system may incorporate one or more imaging sensors to capture images or video of the physical environment, and/or one or more microphones to capture audio of the physical environment. Rather than an opaque display, a head-mountable system may have a transparent or translucent display. The transparent or translucent display may have a medium through which light representative of images is directed to a person's eyes. The display may utilize digital light projection, OLEDs, LEDs, uLEDs, liquid crystal on silicon, laser scanning light source, or any combination of these technologies. The medium may be an optical waveguide, a hologram medium, an optical combiner, an optical reflector, or any combination thereof. In one embodiment, the transparent or translucent display may be configured to become opaque selectively. Projection-based systems may employ retinal projection technology that projects graphical images onto a person's retina. Projection systems also may be configured to project virtual objects into the physical environment, for example, as a hologram or on a physical surface.

Referring now to FIG. 2, components of the head-mountable device can be operably connected to provide the performance described herein. FIG. 2 shows a simplified block diagram of an illustrative head-mountable device 100 in accordance with one embodiment of the invention. It will be appreciated that components described herein can be provided on either or both of a frame and/or a securement element of the head-mountable device 100.

As shown in FIG. 2, the head-mountable device 100 can include a controller 270 with one or more processing units that include or are configured to access a memory 218 having instructions stored thereon. The instructions or computer programs may be configured to perform one or more of the operations or functions described with respect to the head-mountable device 100. The controller 270 can be implemented as any electronic device capable of processing, receiving, or transmitting data or instructions. For example, the controller 270 may include one or more of: a microprocessor, a central processing unit (CPU), an application-specific integrated circuit (ASIC), a digital signal processor (DSP), or combinations of such devices. As described herein, the term “processor” is meant to encompass a single processor or processing unit, multiple processors, multiple processing units, or other suitably configured computing element or elements.

The memory 218 can store electronic data that can be used by the head-mountable device 100. For example, the memory 218 can store electrical data or content such as, for example, audio and video files, documents and applications, device settings and user preferences, timing and control signals or data for the various modules, data structures or databases, and so on. The memory 218 can be configured as any type of memory. By way of example only, the memory 218 can be implemented as random access memory, read-only memory, Flash memory, removable memory, or other types of storage elements, or combinations of such devices.

The head-mountable device 100 can further include a display 190 for displaying visual information for a user. The display 190 can provide visual (e.g., image or video) output. The display 190 can be or include an opaque, transparent, and/or translucent display. The display 190 may have a transparent or translucent medium through which light representative of images is directed to a user's eyes. The display 190 may utilize digital light projection, OLEDs, LEDs, uLEDs, liquid crystal on silicon, laser scanning light source, or any combination of these technologies. The medium may be an optical waveguide, a hologram medium, an optical combiner, an optical reflector, or any combination thereof. In one embodiment, the transparent or translucent display may be configured to become opaque selectively. Projection-based systems may employ retinal projection technology that projects graphical images onto a person's retina. Projection systems also may be configured to project virtual objects into the physical environment, for example, as a hologram or on a physical surface. The head-mountable device 100 can include an optical subassembly 214 configured to help optically adjust and correctly project the image-based content being displayed by the display 190 for close-up viewing. The optical subassembly 214 can include one or more lenses, mirrors, or other optical devices.

The head-mountable device 100 can include a camera 150 for capturing a view of an environment external to the head-mountable device 100. The camera 150 can include an optical sensor, such as a photodiode or a photodiode array. Additionally or alternatively, the camera 150 can include one or more of various types of optical sensors that are arranged in various configurations for detecting user inputs described herein. The camera 150 may be configured to capture an image of a scene or subject located within a field-of-view of the camera 150. The image may be stored in a digital file in accordance with any one of a number of digital formats. In some embodiments, the head-mountable device 100 includes a camera, which includes an image sensor formed from a charge-coupled device (CCD) and/or a complementary metal-oxide-semiconductor (CMOS) device, a photovoltaic cell, a photoresistive component, a laser scanner, and the like. It will be recognized that a camera can include other motion sensing devices.

The head-mountable device 100 can include one or more sensors 140 (e.g., an eye sensor) for tracking features of the user wearing the head-mountable device 100. For example, such sensors can perform facial feature detection, facial movement detection, facial recognition, eye tracking, user mood detection, user emotion detection, voice detection, etc. For example, an eye sensor can optically capture a view of an eye (e.g., pupil) and determine a direction of a gaze of the user. Such eye tracking may be used to determine a location and/or direction of interest. Detection and/or amplification of sound can then be focused if it is received from sources at such a location and/or along such a direction.
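As one hedged illustration of the eye-tracking step, the sketch below maps a pupil center detected in an eye-camera image to an approximate horizontal gaze angle by linearly scaling the pupil's offset from the image center. The camera resolution, the scaling constant, and the function name are assumptions made for illustration; production eye trackers typically use calibrated geometric models rather than a linear mapping.

    # Illustrative assumptions: a 400x400-pixel eye camera and roughly
    # 0.15 degrees of gaze per pixel of pupil offset.
    EYE_IMAGE_WIDTH = 400
    DEGREES_PER_PIXEL = 0.15

    def gaze_azimuth_deg(pupil_center_x: float) -> float:
        """Approximate horizontal gaze angle from the detected pupil center.

        0 degrees means the user is looking straight ahead; positive values
        mean the gaze is directed toward the user's right.
        """
        offset_px = pupil_center_x - EYE_IMAGE_WIDTH / 2
        return offset_px * DEGREES_PER_PIXEL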

Head-mountable device 100 can include a battery 220, which can charge and/or power components of the head-mountable device 100. The battery 220 can also charge and/or power components connected to the head-mountable device 100, such as a portable electronic device 202, as discussed further herein.

The head-mountable device 100 can include an input/output component 226, which can include any suitable component for connecting head-mountable device 100 to other devices. Suitable components can include, for example, audio/video jacks, data connectors, or any additional or alternative input/output components. The input/output component 226 can include buttons, keys, or another feature that can act as a keyboard for operation by the user. As such, the description herein relating to keyboards can apply to keyboards, keys, and/or other input features integrated on the head-mountable device 100. Such an input/output component 226 can be fixedly or removably attached to a main body of the head-mountable device 100.

The head-mountable device 100 can include communications circuitry 228 for communicating with one or more servers or other devices using any suitable communications protocol. For example, communications circuitry 228 can support Wi-Fi (e.g., an 802.11 protocol), Ethernet, Bluetooth, high frequency systems (e.g., 900 MHz, 2.4 GHz, and 5.6 GHz communication systems), infrared, TCP/IP (e.g., any of the protocols used in each of the TCP/IP layers), HTTP, BitTorrent, FTP, RTP, RTSP, SSH, any other communications protocol, or any combination thereof. Communications circuitry 228 can also include an antenna for transmitting and receiving electromagnetic signals.

The head-mountable device 100 can include the microphones 130 as described herein. The microphones 130 can be operably connected to the controller 270 for detection of sound levels and communication of detections for further processing, as described further herein.

The head-mountable device 100 can include the speakers 212 as described herein. The speakers 212 can be operably connected to the controller 270 for control of speaker output, including sound levels, as described further herein.

The head-mountable device 100 can include one or more other sensors. Such sensors can be configured to sense substantially any type of characteristic such as, but not limited to, images, pressure, light, touch, force, temperature, position, motion, and so on. For example, the sensor can be a photodetector, a temperature sensor, a light or optical sensor, an atmospheric pressure sensor, a humidity sensor, a magnet, a gyroscope, an accelerometer, a chemical sensor, an ozone sensor, a particulate count sensor, and so on. By further example, the sensor can be a bio-sensor for tracking biometric characteristics, such as health and activity metrics. Other user sensors can perform facial feature detection, facial movement detection, facial recognition, eye tracking, user mood detection, user emotion detection, voice detection, etc.

The head-mountable device 100 can optionally connect to a portable electronic device 202, which can provide certain functions. For the sake of brevity, the portable electronic device 202 will not be described in detail in FIG. 2. It should be appreciated, however, that the portable electronic device 202 may be embodied in a variety of forms including a variety of features, all or some of which can be utilized by the head-mountable device 100 (e.g., input/output, controls, processing, battery, etc.). The portable electronic device 202 can provide a handheld form factor (e.g., a small portable electronic device that is lightweight, fits in a pocket, etc.). Although not limited to these, examples include media players, phones (including smart phones), PDAs, computers, and the like. The portable electronic device 202 may include a screen 213 for presenting the graphical portion of the media to the user. The screen 213 can be utilized as the primary screen of the head-mountable device 100.

The head-mountable device 100 can include a dock 206 operative to receive the portable electronic device 202. The dock 206 can include a connector (e.g., Lightning, USB, FireWire, power, DVI, etc.), which can be plugged into a complementary connector of the portable electronic device 202. The dock 206 may include features for helping to align the connectors during engagement and for physically coupling the portable electronic device 202 to the head-mountable device 100. For example, the dock 206 may define a cavity for placement of the portable electronic device 202. The dock 206 may also include retaining features for securing portable electronic device 202 within the cavity. The connector on the dock 206 can function as a communication interface between the portable electronic device 202 and the head-mountable device 100.

Referring now to FIG. 3, a user can wear and/or operate a head-mountable device that provides visual outputs based on audio inputs. As shown in FIG. 3, a user 10 can wear the head-mountable device 100, which provides a field-of-view 90 of an external environment. A source 20 of a sound 30 can be located within the field-of-view 90. Other sources of sounds can also be located within the field-of-view 90 and/or outside the field-of-view 90. As each of the sounds is received by the user, the head-mountable device 100 can provide visual outputs that guide the user's attention to particular sources of sound.

Referring now to FIG. 4, the display 190 of the head-mountable device 100 can provide a view of the external environment, including a source 20 of the sound. One or more of the displayed items in the view of the display 190 can correspond to physical objects in an environment. For example, a camera of the head-mountable device 100 can capture a view of the external environment. Based on the captured view, the display 190 can provide a display that includes images of the physical objects. Additionally or alternatively, the display 190 can provide a display of virtual objects that correspond to physical objects in the external environment. For example, recognized objects can be rendered as virtual objects having features (e.g., position, orientation, color, size, etc.) that are based on detections of the physical objects in the external environment. Additionally or alternatively, the display 190 can provide a display of virtual objects that do not correspond to physical objects in the external environment. For example, other objects can be rendered as virtual objects even when no corresponding physical objects are present. Accordingly, it will be recognized that the view can include a view of physical objects and virtual objects.

As shown in FIG. 4, the display 190 can identify a source 20 of a detected sound as having a particular location (e.g., direction of origin) with respect to the head-mountable device 100. Such determinations can be performed by an array of microphones, as discussed herein. Upon determination of the location of the source 20, the corresponding location on the display 190 can also be determined based on a known spatial relationship between the microphones and the display 190 of the head-mountable device 100. As further shown in FIG. 4, an indicator 300 can be visually output by the display 190 to indicate the location of the source 20. Such an output can help the user visually identify the location of the source 20 even when the user is unable to directly identify the location based on the user's own detection of the sound.
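A minimal sketch of this mapping step is shown below, assuming a display with a known horizontal field-of-view and a source direction already expressed as an azimuth in the device frame. The linear angle-to-pixel mapping, the constants, and the function name are illustrative assumptions, not the disclosed device's actual rendering pipeline.

    from typing import Optional

    DISPLAY_WIDTH_PX = 1920
    HORIZONTAL_FOV_DEG = 90.0  # assumed horizontal field-of-view of the display

    def indicator_x(source_azimuth_deg: float) -> Optional[float]:
        """Horizontal pixel position for the indicator, or None if the
        source lies outside the display's field-of-view."""
        half_fov = HORIZONTAL_FOV_DEG / 2
        if abs(source_azimuth_deg) > half_fov:
            return None  # off-screen: handled as in FIG. 6 (edge indicator)
        # Map [-half_fov, +half_fov] linearly onto [0, DISPLAY_WIDTH_PX].
        return (source_azimuth_deg + half_fov) / HORIZONTAL_FOV_DEG * DISPLAY_WIDTH_PX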

The indicator 300 can include an icon, symbol, graphic, text, word, number, character, picture, or other visible feature that can be displayed at, on, and/or near the source 20 as displayed on the display 190. For example, the indicator 300 can correspond to a known characteristic (e.g., identity, name, color, etc.) of the source 20. Additionally or alternatively, the indicator 300 can include visual features such as color, highlighting, glowing, outlines, shadows, or other contrasting features that allow portions thereof to be more distinctly visible when displayed along with the view to the external environment and/or objects therein. The indicator 300 can move across the display 190 as the user moves the head-mountable device to change the field-of-view being captured and/or displayed. For example, the indicator 300 can maintain its position with respect to the source 20 as the source 20 moves within the display 190 due to the user's movement.

Referring now to FIG. 5, a source 20 of a sound 30 can be located outside of the field-of-view 90 provided by the head-mountable device 100. Other sources of sounds can also be located within the field-of-view 90 and/or outside the field-of-view 90. As each of the sounds is received by the user, the head-mountable device 100 can provide visual outputs that guide the user's attention to particular sources of sound, even when such sources are outside of the field-of-view.

Referring now to FIG. 6, the display 190 of the head-mountable device 100 can provide a view of the external environment, even when the view does not include the source of the sound. One or more of the displayed items in the view of the display 190 can correspond to physical objects in an environment, as discussed herein. For example, a camera of the head-mountable device 100 can capture a view of the external environment.

As shown in FIG. 6, the display 190 can identify a source of a detected sound as having a particular location (e.g., direction of origin) with respect to the head-mountable device 100. Such determinations can be performed by an array of microphones, as discussed herein. Upon determination of the location of the source 20, it can be further determined that the location of the source is not within a field-of-view provided by the display 190. Such a determination can be made based on a known spatial relationship between the microphones and the display 190 of the head-mountable device 100. As further shown in FIG. 6, the indicator 300 can be visually output by the display 190 to indicate the location of the source even when the source is not displayed within the field-of-view of the display 190. As such, the indicator 300 can suggest to the user the direction in which the user may change position and/or orientation to capture a view of the source. Such an output can help the user visually identify the location of the source even when the user is unable to directly identify the location based on the user's own detection of the sound.

The indicator 300 can include an icon, symbol, graphic, text, word, number, character, picture, or other visible feature that can be displayed at, on, and/or near the portion of the display 190 that most closely corresponds to the location of the source. By further example, the indicator 300 can correspond to a known characteristic (e.g., identity, name, color, etc.) of the source. Additionally or alternatively, the indicator 300 can include visual features such as color, highlighting, glowing, outlines, shadows, or other contrasting features that allow portions thereof to be more distinctly visible when displayed along with the view to the external environment and/or objects therein.

The indicator 300 can be provided at a portion of the display 190 that is adjacent to an edge of the display 190. The edge can be one that is closest to where the source would be provided if the field-of-view were large enough to include it. For example, the indicator 300 can be provided at a portion of the display 190 that is along a pathway extending from a center of the display 190 and in a direction from the center towards the source. By further example, the indicator 300 can indicate a direction in which the user can turn to bring the source within the field-of-view of the display 190. The indicator 300 can update its position on the display 190 as the user moves the head-mountable device, so that the indicator 300 provides updated suggestions of the direction in which the user can turn to capture the source within the field-of-view of the display 190. Additionally, when the source is brought within the field-of-view of the display 190, the indicator 300 can be provided as shown in FIG. 4.
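Continuing the earlier sketch, the fragment below places an off-screen indicator at the display edge along the ray from the display center toward the source bearing. The two-dimensional geometry, the screen-coordinate convention, and all names remain illustrative assumptions.

    import math

    DISPLAY_W, DISPLAY_H = 1920, 1080  # assumed display resolution

    def edge_indicator_xy(bearing_deg: float) -> tuple[float, float]:
        """Clamp a ray from the display center toward an off-screen source
        (bearing measured clockwise from straight ahead) to the display edge.
        """
        cx, cy = DISPLAY_W / 2, DISPLAY_H / 2
        dx = math.sin(math.radians(bearing_deg))   # horizontal component
        dy = -math.cos(math.radians(bearing_deg))  # screen y grows downward
        # Scale the ray so it just touches the nearest display edge.
        scale = min(cx / abs(dx) if dx else math.inf,
                    cy / abs(dy) if dy else math.inf)
        return cx + dx * scale, cy + dy * scale

For example, a source directly behind the user (bearing 180 degrees) would yield a point at the bottom center of the display under this convention, suggesting that the user turn around.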

Referring now to FIG. 7, a method of operating a head-mountable device is provided to achieve the results described herein. The method 700 can be performed at least in part by a head-mountable device to provide audio-based feedback with a display of a head-mountable device. Additionally or alternatively, at least some steps can be performed in part by another device operatively connected to the head-mountable device. It will be understood that the method 700 illustrated in FIG. 7 is merely an example, and that a method can be performed with additional steps and/or fewer steps than those illustrated in FIG. 7.

In operation 702, a head-mountable device detects a sound with one or more microphones. In operation 704, the location of the source of the sound is determined based on operation of the microphones. For example, the microphones can be directional and/or an array of omnidirectional microphones that provide an ability to detect the direction of the source with respect to the head-mountable device. Optionally, the source can be determined to be within or outside a field-of-view of a display of the head-mountable device. In operation 706, an indicator is displayed on a display of the head-mountable device. Where the source is within a field-of-view of the display of the head-mountable device, the indicator can be provided on the display at or near the source as output on the display. Where the source is outside of a field-of-view of the display of the head-mountable device, the indicator can be provided on the display as described herein (e.g., to indicate a direction in which the user can turn to view the source).
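For operation 704, one common localization approach, offered here as an assumed example rather than a method required by the disclosure, estimates the time difference of arrival between a pair of omnidirectional microphones by cross-correlation and converts it to a bearing:

    import numpy as np

    SPEED_OF_SOUND = 343.0  # meters per second

    def bearing_from_mic_pair(left: np.ndarray, right: np.ndarray,
                              sample_rate: float, mic_distance: float) -> float:
        """Estimate the source bearing in degrees from one microphone pair.

        0 degrees means straight ahead; positive values are toward the right
        microphone (signs follow numpy.correlate's lag convention).
        """
        # The lag of the cross-correlation peak is the time difference of
        # arrival between the two microphones, in samples.
        corr = np.correlate(left, right, mode="full")
        lag = np.argmax(corr) - (len(right) - 1)
        tdoa = lag / sample_rate
        # Clamp to the physically possible range before taking the arcsine.
        s = np.clip(tdoa * SPEED_OF_SOUND / mic_distance, -1.0, 1.0)
        return float(np.degrees(np.arcsin(s)))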

Referring now to FIG. 8, a method of operating a head-mountable device is provided to achieve the results described herein. The method 800 can be performed at least in part by a head-mountable device to provide audio-based feedback with a speaker of a head-mountable device. Additionally or alternatively, at least some steps can be performed in part by another device operatively connected to the head-mountable device. It will be understood that the method 800 illustrated in FIG. 8 is merely an example, and that a method can be performed with additional steps and/or fewer steps than those illustrated in FIG. 8.

In operation 802, a head-mountable device determines a target characteristic of a sound to be detected. The target characteristic can be based on a user input. For example, the target characteristic can be selected (e.g., from a menu) and/or input by a user. The target characteristic can be based on a user input in which the user selects a previously recorded sound to form the basis for analysis of subsequently detected sounds. The target characteristic can be a frequency, volume (e.g., amplitude), location, type of source, and/or range of one or more of the above. For example, a sound of a particular type can be targeted, so that the audio output of the head-mountable device is focused on sounds having such a target characteristic.

In operation 804, the head-mountable device detects a sound with one or more microphones.

In operation 806, the head-mountable device modifies audio output of a speaker thereof based on the target characteristic. For example, the head-mountable device can compare the detected sound with the target characteristic to determine whether the detected sound has the target characteristic. If the detected sound is determined to have the target characteristic, it can be amplified as an output of the speakers. For example, the audio output of the speakers can be controlled to amplify audio input that corresponds to the sounds that have the target characteristic. By further example, audio input received by microphones that are directed to the qualifying sound (e.g., having the target characteristic) can be amplified (e.g., volume increased) and audio input received by microphones that are not directed to the qualifying sound (e.g., lacking the target characteristic) can be reduced (e.g., volume decreased).
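As a hedged sketch of operations 802 through 806, the code below treats the target characteristic as a frequency band, checks each microphone's frame for energy in that band, and scales the per-microphone gains accordingly. The band-energy test, the threshold, the gain values, and the names are illustrative assumptions rather than the disclosed implementation.

    import numpy as np

    def band_energy(frame: np.ndarray, sample_rate: float,
                    lo_hz: float, hi_hz: float) -> float:
        """Fraction of the frame's spectral energy inside [lo_hz, hi_hz]."""
        spectrum = np.abs(np.fft.rfft(frame)) ** 2
        freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
        band = spectrum[(freqs >= lo_hz) & (freqs <= hi_hz)].sum()
        total = spectrum.sum() or 1.0  # avoid division by zero on silence
        return float(band / total)

    def mix_with_target(frames: list[np.ndarray], sample_rate: float,
                        lo_hz: float, hi_hz: float,
                        boost: float = 2.0, cut: float = 0.5) -> np.ndarray:
        """Amplify mics whose input matches the target band; attenuate others."""
        gains = [boost if band_energy(f, sample_rate, lo_hz, hi_hz) > 0.5 else cut
                 for f in frames]
        return sum(g * f for g, f in zip(gains, frames)) / len(frames)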

The modified audio output can allow a user to focus on audio that satisfies the target characteristic, therefore allowing the user to filter out audio that does not satisfy the target characteristic. Accordingly, the burden on the user to separate multiple sounds is reduced by focusing on sounds of interest.

Referring now to FIG. 9, a method of operating a head-mountable device is provided to achieve the results described herein. The method 900 can be performed at least in part by a head-mountable device to provide audio-based feedback with a speaker of a head-mountable device. Additionally or alternatively, at least some steps can be performed in part by another device operatively connected to the head-mountable device. It will be understood that the method 900 illustrated in FIG. 9 is merely an example, and that a method can be performed with additional steps and/or fewer steps than those illustrated in FIG. 9.

In operation 902, a head-mountable device determines a gaze of the user. For example, an eye sensor can be operated to determine the direction in which the user's eye (e.g., pupil) is directed. Such a gaze direction can be understood to indicate the direction of the user's interest, and therefore the locations from which the user desires to hear sounds.

In operation 904, the head-mountable device detects a sound with one or more microphones.

In operation 906, the head-mountable device modifies audio output of a speaker thereof based on the direction of the user's gaze. For example, the head-mountable device can compare the detected sound with the direction of the user's gaze to determine whether the detected sound is from a source that is along the direction of the user's gaze. If the detected sound is determined to be along the direction of the user's gaze, it can be amplified as an output of the speakers. For example, the audio output of the speakers can be controlled to amplify audio input that corresponds to the sounds that are along the direction of the user's gaze. By further example, audio input received by microphones that are directed in the qualifying direction (e.g., the direction of the user's gaze) can be amplified (e.g., volume increased) and audio input received by microphones that are not directed to the qualifying direction (e.g., other than in the direction of the user's gaze) can be reduced (e.g., volume decreased).
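One way operations 902 through 906 could fit together, presented as an assumed sketch rather than the disclosed signal chain, is to weight each microphone by how closely its known orientation matches the gaze azimuth (such as one produced by the earlier eye-sensor sketch). The cosine falloff and the names are illustrative choices.

    import numpy as np

    def gaze_weighted_mix(frames: dict[str, np.ndarray],
                          mic_azimuths_deg: dict[str, float],
                          gaze_azimuth_deg: float) -> np.ndarray:
        """Weight each microphone by alignment with the gaze direction
        (cosine falloff), then mix to a single output. Assumes at least
        one microphone frame, all frames equal length."""
        total = np.zeros_like(next(iter(frames.values())))
        for name, frame in frames.items():
            diff = np.deg2rad(mic_azimuths_deg[name] - gaze_azimuth_deg)
            gain = max(0.0, float(np.cos(diff)))  # 1 when aligned, 0 beyond 90 degrees
            total += gain * frame
        return total / len(frames)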

The modified audio output can allow a user to indicate the desired audio focus by merely directing the gaze of the eye in the desired direction. Other sounds can be filtered out. Accordingly, the burden on the user to separate multiple sounds is reduced by naturally focusing on sources of interest with eye gaze.

Accordingly, embodiments of the present disclosure provide a head-mountable device with multiple microphones for directional audio detection. The head-mountable device can also include a speaker for audio output and/or a display for visual output. The head-mountable device can be configured to provide visual outputs based on audio inputs by displaying an indicator on a display based on a location of a source of a sound. The head-mountable device can be configured to provide audio outputs based on audio inputs by modifying an audio output of the speaker based on a detected sound and a target characteristic. Such characteristics can be based on a direction of a gaze of the user, as detected by an eye sensor.

Various examples of aspects of the disclosure are described below as clauses for convenience. These are provided as examples, and do not limit the subject technology.

Clause A: a head-mountable device comprising: multiple microphones; a display; a controller configured to perform the operations of: detecting a sound with the microphones; determining a location of a source of the sound with respect to the head-mountable device; and displaying an indicator on the display based on the location of the source.

Clause B: a head-mountable device comprising: multiple microphones; a speaker; a controller configured to perform the operations of: determining a target characteristic; detecting a sound with the microphones; comparing the sound to the target characteristic; and modifying an audio output of the speaker based on the sound and the target characteristic.

Clause C: a head-mountable device comprising: multiple microphones; a speaker; an eye sensor; a controller configured to perform the operations of: determining a direction of a gaze of a user based on the eye sensor; detecting a sound with the microphones; and modifying an audio output of the speaker based on the sound and the direction of the gaze.

One or more of the above clauses can include one or more of the features described below. It is noted that any of the following clauses may be combined in any combination with each other, and placed into a respective independent clause, e.g., clause A, B, or C.

Clause 1: the location of the source is within a field-of-view provided by the display and the indicator is displayed at a portion of the display that corresponds to the location of the source.

Clause 2: the location of the source is outside a field-of-view of the display and the indicator is displayed at an edge of the display that corresponds to a direction of the location of the source relative to a center of the display.

Clause 3: determining the location of the source of the sound with respect to the head-mountable device comprises determining which one of the microphones is most closely directed toward the location of the source.

Clause 4: the controller is further configured to perform the operation of determining a target characteristic; and determining the location of the source of the sound with respect to the head-mountable device is based on the sound and the target characteristic.

Clause 5: the target characteristic is based on a user input that identifies the sound.

Clause 6: the indicator is based on a characteristic of the source of the sound.

Clause 7: a camera, wherein the display is configured to display a view captured by the camera.

Clause 8: modifying the audio output of the speaker comprises increasing a volume of the audio output that is based on audio input from one of the microphones that is directed toward a location of a source of the sound.

Clause 9: the speaker is one of multiple speakers; and modifying the audio output comprises increasing a volume of one of the speakers that corresponds to a direction that is toward a location of a source of the sound.

Clause 10: the target characteristic is based on a frequency of the sound.

Clause 11: the target characteristic is based on a user input that identifies the sound.

Clause 12: the target characteristic is based on facial recognition of an individual.

Clause 13: determining the direction of the gaze of the user comprises optically capturing a view of an eye of the user with the eye sensor.

Clause 14: modifying the audio output of the speaker comprises increasing a volume of the audio output that is based on audio input from one of the microphones that is directed toward the direction of the gaze.

Clause 15: the controller is further configured to determine that one of the microphones is more closely directed toward the direction of the gaze than others of the microphones.

Clause 16: modifying the audio output of the speaker comprises: increasing a volume of the audio output that is based on audio input from the one of the microphones; and decreasing a volume of the audio output that is based on audio input from the others of the microphones.

Clause 17: a camera; and a display configured to display a view captured by the camera, wherein the direction of the gaze of the user extends through the display.

As described above, one aspect of the present technology may include the gathering and use of data available from various sources. The present disclosure contemplates that in some instances, this gathered data may include personal information data that uniquely identifies or can be used to contact or locate a specific person. Such personal information data can include demographic data, location-based data, telephone numbers, email addresses, Twitter IDs, home addresses, data or records relating to a user's health or level of fitness (e.g., vital signs measurements, medication information, exercise information), date of birth, or any other identifying or personal information.

The present disclosure recognizes that the use of such personal information data, in the present technology, can be used to the benefit of users. For instance, health and fitness data may be used to provide insights into a user's general wellness, or may be used as positive feedback to individuals using technology to pursue wellness goals.

The present disclosure contemplates that the entities responsible for the collection, analysis, disclosure, transfer, storage, or other use of such personal information data will comply with well-established privacy policies and/or privacy practices. In particular, such entities should implement and consistently use privacy policies and practices that are generally recognized as meeting or exceeding industry or governmental requirements for maintaining personal information data private and secure. Such policies should be easily accessible by users, and should be updated as the collection and/or use of data changes. Personal information from users should be collected for legitimate and reasonable uses of the entity and not shared or sold outside of those legitimate uses. Further, such collection/sharing should occur after receiving the informed consent of the users. Additionally, such entities should consider taking any needed steps for safeguarding and securing access to such personal information data and ensuring that others with access to the personal information data adhere to their privacy policies and procedures. Further, such entities can subject themselves to evaluation by third parties to certify their adherence to widely accepted privacy policies and practices. In addition, policies and practices should be adapted for the particular types of personal information data being collected and/or accessed and adapted to applicable laws and standards, including jurisdiction-specific considerations. For instance, in the US, collection of or access to certain health data may be governed by federal and/or state laws, such as the Health Insurance Portability and Accountability Act (HIPAA); whereas health data in other countries may be subject to other regulations and policies and should be handled accordingly. Hence different privacy practices should be maintained for different personal data types in each country.

Despite the foregoing, the present disclosure also contemplates embodiments in which users selectively block the use of, or access to, personal information data. That is, the present disclosure contemplates that hardware and/or software elements can be provided to prevent or block access to such personal information data. For example, in the case of advertisement delivery services, the present technology can be configured to allow users to select to “opt in” or “opt out” of participation in the collection of personal information data during registration for services or anytime thereafter. In another example, users can select not to provide mood-associated data for targeted content delivery services. In yet another example, users can select to limit the length of time mood-associated data is maintained or entirely prohibit the development of a baseline mood profile. In addition to providing “opt in” and “opt out” options, the present disclosure contemplates providing notifications relating to the access or use of personal information. For instance, a user may be notified upon downloading an app that their personal information data will be accessed and then reminded again just before personal information data is accessed by the app.

Moreover, it is the intent of the present disclosure that personal information data should be managed and handled in a way to minimize risks of unintentional or unauthorized access or use. Risk can be minimized by limiting the collection of data and deleting data once it is no longer needed. In addition, and when applicable, including in certain health related applications, data de-identification can be used to protect a user's privacy. De-identification may be facilitated, when appropriate, by removing specific identifiers (e.g., date of birth, etc.), controlling the amount or specificity of data stored (e.g., collecting location data at a city level rather than at an address level), controlling how data is stored (e.g., aggregating data across users), and/or other methods.

Therefore, although the present disclosure broadly covers use of personal information data to implement one or more various disclosed embodiments, the present disclosure also contemplates that the various embodiments can also be implemented without the need for accessing such personal information data. That is, the various embodiments of the present technology are not rendered inoperable due to the lack of all or a portion of such personal information data. For example, content can be selected and delivered to users by inferring preferences based on non-personal information data or a bare minimum amount of personal information, such as the content being requested by the device associated with a user, other non-personal information available to the content delivery services, or publicly available information.

A reference to an element in the singular is not intended to mean one and only one unless specifically so stated, but rather one or more. For example, “a” module may refer to one or more modules. An element preceded by “a,” “an,” “the,” or “said” does not, without further constraints, preclude the existence of additional same elements.

Headings and subheadings, if any, are used for convenience only and do not limit the invention. The word exemplary is used to mean serving as an example or illustration. To the extent that the term include, have, or the like is used, such term is intended to be inclusive in a manner similar to the term comprise as comprise is interpreted when employed as a transitional word in a claim. Relational terms such as first and second and the like may be used to distinguish one entity or action from another without necessarily requiring or implying any actual such relationship or order between such entities or actions.

Phrases such as an aspect, the aspect, another aspect, some aspects, one or more aspects, an implementation, the implementation, another implementation, some implementations, one or more implementations, an embodiment, the embodiment, another embodiment, some embodiments, one or more embodiments, a configuration, the configuration, another configuration, some configurations, one or more configurations, the subject technology, the disclosure, the present disclosure, other variations thereof and the like are for convenience and do not imply that a disclosure relating to such phrase(s) is essential to the subject technology or that such disclosure applies to all configurations of the subject technology. A disclosure relating to such phrase(s) may apply to all configurations, or one or more configurations. A disclosure relating to such phrase(s) may provide one or more examples. A phrase such as an aspect or some aspects may refer to one or more aspects and vice versa, and this applies similarly to other foregoing phrases.

A phrase “at least one of” preceding a series of items, with the terms “and” or “or” to separate any of the items, modifies the list as a whole, rather than each member of the list. The phrase “at least one of” does not require selection of at least one item; rather, the phrase allows a meaning that includes at least one of any one of the items, and/or at least one of any combination of the items, and/or at least one of each of the items. By way of example, each of the phrases “at least one of A, B, and C” or “at least one of A, B, or C” refers to only A, only B, or only C; any combination of A, B, and C; and/or at least one of each of A, B, and C.

It is understood that the specific order or hierarchy of steps, operations, or processes disclosed is an illustration of exemplary approaches. Unless explicitly stated otherwise, it is understood that the specific order or hierarchy of steps, operations, or processes may be performed in a different order. Some of the steps, operations, or processes may be performed simultaneously. The accompanying method claims, if any, present elements of the various steps, operations, or processes in a sample order, and are not meant to be limited to the specific order or hierarchy presented. These may be performed serially, linearly, in parallel, or in a different order. It should be understood that the described instructions, operations, and systems can generally be integrated together in a single software/hardware product or packaged into multiple software/hardware products.

In one aspect, a term coupled or the like may refer to being directly coupled. In another aspect, a term coupled or the like may refer to being indirectly coupled.

Terms such as top, bottom, front, rear, side, horizontal, vertical, and the like refer to an arbitrary frame of reference, rather than to the ordinary gravitational frame of reference. Thus, such a term may extend upwardly, downwardly, diagonally, or horizontally in a gravitational frame of reference.

The disclosure is provided to enable any person skilled in the art to practice the various aspects described herein. In some instances, well-known structures and components are shown in block diagram form in order to avoid obscuring the concepts of the subject technology. The disclosure provides various examples of the subject technology, and the subject technology is not limited to these examples. Various modifications to these aspects will be readily apparent to those skilled in the art, and the principles described herein may be applied to other aspects.

All structural and functional equivalents to the elements of the various aspects described throughout the disclosure that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims. Moreover, nothing disclosed herein is intended to be dedicated to the public regardless of whether such disclosure is explicitly recited in the claims. No claim element is to be construed under the provisions of 35 U.S.C. § 112, sixth paragraph, unless the element is expressly recited using the phrase “means for” or, in the case of a method claim, the element is recited using the phrase “step for”.

The title, background, brief description of the drawings, abstract, and drawings are hereby incorporated into the disclosure and are provided as illustrative examples of the disclosure, not as restrictive descriptions. It is submitted with the understanding that they will not be used to limit the scope or meaning of the claims. In addition, in the detailed description, it can be seen that the description provides illustrative examples and the various features are grouped together in various implementations for the purpose of streamlining the disclosure. The method of disclosure is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, as the claims reflect, inventive subject matter lies in less than all features of a single disclosed configuration or operation. The claims are hereby incorporated into the detailed description, with each claim standing on its own as a separately claimed subject matter.

The claims are not intended to be limited to the aspects described herein, but are to be accorded the full scope consistent with the language of the claims and to encompass all legal equivalents. Notwithstanding, none of the claims are intended to embrace subject matter that fails to satisfy the requirements of the applicable patent law, nor should they be interpreted in such a way.

What is claimed is:
1. A head-mountable device comprising:
multiple microphones configured to detect a sound and a location of an object producing the sound;
an eye sensor configured to detect a direction of an eye gaze;
a speaker configured to provide an audio output; and
a controller configured to perform operations comprising:
determining whether the location of the object is in the direction of the eye gaze; and
when the location of the object is in the direction of the eye gaze, amplifying the sound with the audio output of the speaker.
2. The head-mountable device of claim 1, wherein the controller is further configured to, when the location of the object is not in the direction of the eye gaze, reduce a volume of the sound with the audio output of the speaker.
3. The head-mountable device of claim 1, wherein the multiple microphones are further configured to detect an additional sound, wherein the controller is further configured to, when a location of a source of the additional sound is not in the direction of the eye gaze, filter the additional sound.
4. The head-mountable device of claim 1, further comprising: a camera configured to capture an image; and a display configured to output a view based on the image captured by the camera, wherein the object is a physical object captured within the image.
5. The head-mountable device of claim 1, wherein the multiple microphones are configured to detect the location of the object by determining which one of the microphones is most closely directed toward the object.
6. A head-mountable device comprising:
a display configured to output a view including an object;
an eye sensor configured to detect a direction of an eye gaze with respect to the view;
a speaker; and
a controller configured to perform operations comprising:
determining whether the object in the view is in the direction of the eye gaze; and
when the object is in the direction of the eye gaze, providing an audio output, with the speaker, corresponding to the object.
7. The head-mountable device of claim 6, further comprising a camera configured to capture an image, wherein the display is further configured to output the view based on the image captured by the camera, and the object is a physical object captured within the image.
8. The head-mountable device of claim 6, further comprising multiple microphones configured to detect a sound produced by the object and a location of the object, wherein providing the audio output corresponding to the object comprises amplifying the sound produced by the object.
9. The head-mountable device of claim 8, wherein the multiple microphones are configured to detect the location of the object by determining which one of the microphones is most closely directed toward the object.
10. The head-mountable device of claim 8, wherein the controller is further configured to, when the location of the object is not in the direction of the eye gaze, reduce a volume of the sound with the audio output of the speaker.
11. The head-mountable device of claim 8, wherein the multiple microphones are further configured to detect an additional sound, wherein the controller is further configured to, when a location of a source of the additional sound is not in the direction of the eye gaze, filter the additional sound.
12. The head-mountable device of claim 6, wherein the object is a virtual object.
13. A head-mountable device comprising:
a camera configured to capture an image;
a display configured to output a view based on the image captured by the camera;
an eye sensor configured to detect a direction of an eye gaze;
a speaker; and
a controller configured to perform operations comprising:
outputting, with the display, an indicator corresponding to an object;
determining whether the object is in the direction of the eye gaze; and
when the object is on the display and in the direction of the eye gaze, providing an audio output, with the speaker, corresponding to the object.
14. The head-mountable device of claim 13, wherein the object is a physical object captured within a field-of-view of the camera.
15. The head-mountable device of claim 13, wherein, when the object is within the view output by the display, the indicator is displayed at a portion of the display corresponding to the object.
16. The head-mountable device of claim 13, wherein, when the object is located outside the view output by the display, the indicator is displayed at an edge of the display corresponding to a direction of a location of the object relative to a center of the display.
17. The head-mountable device of claim 13, wherein the indicator corresponds to a characteristic of the object.
18. The head-mountable device of claim 13, further comprising multiple microphones configured to detect a sound produced by the object and a location of the object, wherein providing the audio output corresponding to the object comprises amplifying the sound produced by the object.
19. The head-mountable device of claim 18, wherein the multiple microphones are configured to detect the location of the object by determining which one of the microphones is most closely directed toward the object.
20. The head-mountable device of claim 13, wherein the object is a virtual object.
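
By way of illustration only, the gaze-dependent audio behavior recited in claims 1-3 and 10-11 can be sketched in Python. This is a minimal, hypothetical sketch, not the claimed implementation: the angular tolerance, the gain values, and all names (SoundSource, process_sources, GAZE_MATCH_DEGREES) are assumptions introduced here for clarity.

    from dataclasses import dataclass

    # Hypothetical angular tolerance for treating a sound source as being
    # "in the direction of the eye gaze"; the disclosure specifies no value.
    GAZE_MATCH_DEGREES = 15.0

    @dataclass
    class SoundSource:
        azimuth_deg: float  # estimated direction of the source, e.g., from which
                            # microphone is most closely directed toward it (claim 5)
        samples: list       # detected audio samples for this source

    def angular_offset(a: float, b: float) -> float:
        """Smallest absolute difference between two azimuths, in degrees."""
        return abs((a - b + 180.0) % 360.0 - 180.0)

    def process_sources(gaze_deg, sources, boost=2.0, cut=0.25):
        """Amplify sources aligned with the eye gaze; attenuate the rest."""
        outputs = []
        for src in sources:
            if angular_offset(src.azimuth_deg, gaze_deg) <= GAZE_MATCH_DEGREES:
                gain = boost  # claim 1: amplify the sound the user gazes toward
            else:
                gain = cut    # claims 2-3, 10-11: reduce volume / filter others
            outputs.append([s * gain for s in src.samples])
        return outputs

    # Example: gaze at 10 degrees; the first source is amplified,
    # the second (behind the user) is attenuated.
    mixed = process_sources(10.0, [SoundSource(12.0, [0.1, 0.2]),
                                   SoundSource(170.0, [0.1, 0.2])])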
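
Similarly, the indicator placement of claims 15-16 (at the portion of the display corresponding to an in-view object, or at the display edge in the direction of an out-of-view object relative to the display center) can be illustrated geometrically. The coordinate convention and the function below are hypothetical sketches, not the claimed implementation.

    def indicator_position(obj_x, obj_y, width, height):
        """Return display coordinates for an indicator per claims 15-16.

        Assumes a pixel convention with the origin at the top-left of the
        display; (obj_x, obj_y) is the object's projected location, which
        may fall outside the display bounds.
        """
        cx, cy = width / 2.0, height / 2.0
        if 0 <= obj_x <= width and 0 <= obj_y <= height:
            # Claim 15: in view, so display at the portion of the display
            # corresponding to the object.
            return obj_x, obj_y
        # Claim 16: out of view, so clamp along the ray from the display
        # center toward the object so the indicator lands on the matching edge.
        dx, dy = obj_x - cx, obj_y - cy
        scale = min(cx / abs(dx) if dx else float("inf"),
                    cy / abs(dy) if dy else float("inf"))
        return cx + dx * scale, cy + dy * scale

    # Example: an object far to the right yields an indicator on the right edge.
    print(indicator_position(3000, 540, 1920, 1080))  # -> (1920.0, 540.0)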